AITopics


Neural Information Processing Systems, Feb-9-2026, 11:47:28 GMT

Grasp Proposal Networks: An End-to-End Solution for Visual Learning of Robotic Grasps

Recent research shows its great potential by preparing and learning from large-scale synthetic datasets.

artificial intelligence, machine learning

Country:

North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Dec-11-2025

Benchmarking World-Model Learning

Warrier, Archana, Nguyen, Dat, Naim, Michelangelo, Jain, Moksh, Liang, Yichao, Schroeder, Karen, Yang, Cambridge, Tenenbaum, Joshua B., Vollmer, Sebastian, Ellis, Kevin, Tavares, Zenna

Model-learning agents should gather information to learn world models that support many downstream tasks and inferences, such as predicting unobserved states, estimating near- and far-term consequences of actions, planning action sequences, and detecting changes in dynamics. Current methods for learning and evaluating world models diverge from this goal: training and evaluation are anchored to next-frame prediction, and success is scored by reward maximization in the same environment. We propose WorldTest, a protocol to evaluate model-learning agents that separates reward-free interaction from a scored test phase in a different but related environment. WorldTest is open-ended (models should support many different tasks unknown ahead of time) and agnostic to model representation, allowing comparison across approaches. We instantiated WorldTest with AutumnBench, a suite of 43 interactive grid-world environments and 129 tasks across three families: masked-frame prediction, planning, and predicting changes to the causal dynamics. We compared 517 human participants and three frontier models on AutumnBench. We found that humans outperform the models, and scaling compute improves performance only in some environments but not others. WorldTest provides a novel template (reward-free exploration, derived tests, and behavior-based scoring) to evaluate what agents learn about environment dynamics, and AutumnBench exposes significant headroom in world-model learning.

agent, artificial intelligence, machine learning

2510.19788

Country: North America (0.27)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Neural Information Processing Systems, Aug-15-2025, 13:33:22 GMT

a378383b89e6719e15cd1aa45478627c-Supplemental.pdf

dataset, probability, sequence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.32)

Wasserroth, Fenya, Avramidis, Eleftherios, Czehmann, Vera, Kojic, Tanja, Nunnari, Fabrizio, Möller, Sebastian

Evaluation of a Sign Language Avatar on Comprehensibility, User Experience & Acceptability

arXiv.org Artificial Intelligence, Aug-12-2025

This paper presents an investigation into the impact of adding adjustment features to an existing sign language (SL) avatar on a Microsoft HoloLens 2 device. Through a detailed analysis of interactions of expert German Sign Language (DGS) users with both adjustable and non-adjustable avatars in a specific use case, this study identifies the key factors influencing the comprehensibility, the user experience (UX), and the acceptability of such a system. Despite user preference for adjustable settings, no significant improvements in UX or comprehensibility were observed, which remained at low levels, amid missing SL elements (mouthings and facial expressions) and implementation issues (indistinct hand shapes, lack of feedback and menu positioning). Hedonic quality was rated higher than pragmatic quality, indicating that users found the system more emotionally or aesthetically pleasing than functionally useful. Stress levels were higher for the adjustable avatar, reflecting lower performance, greater effort and more frustration. Additionally, concerns were raised about whether the HoloLens adjustment gestures are intuitive and easy to familiarise oneself with. While acceptability of the concept of adjustability was generally positive, it was strongly dependent on usability and animation quality. This study highlights that personalisation alone is insufficient, and that SL avatars must be comprehensible by default. Key recommendations include enhancing mouthing and facial animation, improving interaction interfaces, and applying participatory design.

artificial intelligence, avatar, natural language

doi: 10.1145/3742886.3756719

2508.05358

Country:

Europe > Germany (0.72)
North America > United States (0.69)

Genre:

Research Report > New Finding (0.86)
Research Report > Experimental Study > Negative Result (0.68)

Industry: Education > Curriculum > Subject-Specific Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Human Computer Interaction > Interfaces (0.94)

arXiv.org Artificial Intelligence, Jun-27-2025

Utility-Driven Speculative Decoding for Mixture-of-Experts

Saxena, Anish, Tsai, Po-An, Taneja, Hritvik, Jaleel, Aamer, Qureshi, Moinuddin

GPU memory bandwidth is the main bottleneck for low-latency Large Language Model (LLM) inference. Speculative decoding leverages idle GPU compute by using a lightweight drafter to propose K tokens, which the LLM verifies in parallel, boosting token throughput. In conventional dense LLMs, all model weights are fetched each iteration, so speculation adds no latency overhead. Emerging Mixture of Experts (MoE) models activate only a subset of weights per token, greatly reducing data movement. However, we show that speculation is ineffective for MoEs: draft tokens collectively activate more weights, increasing data movement and verification time by 2-3x. When token throughput gains fail to offset this overhead, speculation causes slowdowns up to 1.5x, making it infeasible. Even when useful, the optimal K varies by task, model, and even between requests and iterations. Thus, despite widespread use in dense LLMs, speculation remains impractical in leading MoEs. We present Cascade, a utility-driven framework that selectively enables speculation to avoid slowdowns and dynamically tunes K to accelerate MoE serving. Cascade uses a lightweight metric, speculation utility, the ratio of token gains to verification cost, which shows iteration-level locality, enabling periodic decisions via short test and longer set phases. For each request, Cascade disables speculation if utility drops below one during testing, and when utility exceeds one, tests multiple K-values to choose the utility-maximizing K for the set phase. We implement Cascade in vLLM and evaluate it on five popular MoEs with workloads spanning code, math, extraction, and mixed tasks. Cascade limits slowdown to 5% (vs. 1.5x) and improves throughput by 7-14% over static K, making speculative decoding practical for MoEs.
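The test-then-set decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Trial` record, the function names, and the exact utility formula (token gain divided by relative iteration time, both measured against a no-speculation baseline) are assumptions made for illustration, and the sketch folds the "disable" and "pick K" steps into one pass.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Trial:
    """Hypothetical measurement from a short test phase at draft length k."""
    k: int                  # number of draft tokens proposed per iteration
    tokens_per_iter: float  # mean tokens emitted per iteration with speculation
    iter_time: float        # mean iteration time with speculation (any unit)


def speculation_utility(trial: Trial, base_iter_time: float,
                        base_tokens_per_iter: float = 1.0) -> float:
    """Assumed form of utility: token gain divided by relative verification cost."""
    token_gain = trial.tokens_per_iter / base_tokens_per_iter
    cost = trial.iter_time / base_iter_time
    return token_gain / cost


def choose_k(trials: Iterable[Trial], base_iter_time: float) -> int:
    """Return the utility-maximizing k, or 0 (speculation off) if no k beats 1.0."""
    best_k, best_utility = 0, 1.0
    for trial in trials:
        utility = speculation_utility(trial, base_iter_time)
        if utility > best_utility:
            best_k, best_utility = trial.k, utility
    return best_k


# Illustrative numbers only: k=2 yields 1.8 tokens at 1.5x the baseline time
# (utility 1.2); k=4 yields 2.4 tokens at 3.0x the baseline time (utility 0.8).
trials = [Trial(k=2, tokens_per_iter=1.8, iter_time=1.5),
          Trial(k=4, tokens_per_iter=2.4, iter_time=3.0)]
chosen = choose_k(trials, base_iter_time=1.0)
```

With these made-up measurements the sketch selects k=2. If every tested k had utility at or below one, it would return 0, which corresponds to disabling speculation for the following set phase.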

large language model, machine learning, speculation

2506.20675

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Neural Information Processing Systems, Jan-27-2025, 17:51:09 GMT

Review for NeurIPS paper: Calibrating CNNs for Lifelong Learning

Summary and Contributions: Update: My initial review noted two main issues with the paper: reliance on the initial model, and the use of task labels during the test phase. The author response addresses the first issue, but misses the point on the second one. This alone is not sufficient to strongly influence my overall rating. In my understanding, several previous methods, such as LwF and iCaRL (both highlighted in the author response), classify samples without knowing which group of classes (i.e., old or new) they belong to. In other words, they use a single framework that can identify samples from any of the old or new classes, without additional information.

calibration module, task label, test phase

Genre: Instructional Material (0.40)

Industry: Education > Educational Setting > Continuing Education (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.67)

Neural Information Processing Systems, Jan-27-2025, 17:51:01 GMT

Review for NeurIPS paper: Calibrating CNNs for Lifelong Learning

The paper proposes a continual learning approach for CNN models. This is achieved through spatial and channel-wise calibration modules, one for each new task. These calibration modules are introduced between each pair of consecutive layers in the original base model. The base model is learnt on the first task, and training data from the subsequent tasks is used to learn the calibration modules. Extensive experiments show the superiority of the proposed method in terms of accuracies, with minimal computation and storage overhead. It is important to emphasize that the proposed approach requires task labels in the test phase.

calibrating cnn, calibration module, lifelong learning

Genre: Instructional Material (0.40)

Industry: Education > Educational Setting > Continuing Education (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Jan-18-2025, 10:15:30 GMT

Is Heterogeneity Notorious? Taming Heterogeneity to Handle Test-Time Shift in Federated Learning

heterogeneity, inter-client heterogeneity, intra-client heterogeneity

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Jan-12-2025

SELMA3D challenge: Self-supervised learning for 3D light-sheet microscopy image segmentation

Chen, Ying, Al-Maskari, Rami, Horvath, Izabela, Ali, Mayar, Hoher, Luciano, Yang, Kaiyuan, Lin, Zengming, Zhai, Zhiwei, Shen, Mengzhe, Xun, Dejin, Wang, Yi, Xu, Tony, Goubran, Maged, Wu, Yunheng, Mori, Kensaku, Paetzold, Johannes C., Erturk, Ali

Recent innovations in light-sheet microscopy, paired with developments in tissue clearing techniques, enable the 3D imaging of large mammalian tissues with cellular resolution. Combined with the progress in large-scale data analysis, driven by deep learning, these innovations empower researchers to rapidly investigate the morphological and functional properties of diverse biological samples. Segmentation, a crucial preliminary step in the analysis process, can be automated using domain-specific deep learning models with expert-level performance. However, these models exhibit high sensitivity to domain shifts, leading to a significant drop in accuracy when applied to data outside their training distribution. To address this limitation, and inspired by the recent success of self-supervised learning in training generalizable models, we organized the SELMA3D Challenge during the MICCAI 2024 conference. SELMA3D provides a vast collection of light-sheet images from cleared mouse and human brains, comprising 35 large 3D images, each with over 1000^3 voxels, and 315 annotated small patches for finetuning, preliminary testing and final testing. The dataset encompasses diverse biological structures, including vessel-like and spot-like structures. Five teams participated in all phases of the challenge, and their proposed methods are reviewed in this paper. Quantitative and qualitative results from most participating teams demonstrate that self-supervised learning on large datasets improves segmentation model performance and generalization. We will continue to support and extend SELMA3D as an inaugural MICCAI challenge focused on self-supervised learning for 3D microscopy image segmentation.

artificial intelligence, machine learning, segmentation

2501.0388

Country:

Europe > Germany (0.29)
Asia > China (0.28)
North America > Canada (0.28)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)